Overview

Dataset statistics

Number of variables: 20
Number of observations: 1017209
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 179.7 MiB
Average record size in memory: 185.2 B
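
The average record size can be cross-checked against the two totals above. This is a minimal sketch, assuming that 1 MiB is 2**20 bytes and that the reported 179.7 MiB is itself rounded, so the result only agrees to about one decimal place.

```python
# Recompute the average record size from the reported totals.
# Assumption: 1 MiB = 2**20 bytes; 179.7 MiB is a rounded figure.
total_mib = 179.7
n_rows = 1_017_209

total_bytes = total_mib * 2**20
avg_record_bytes = total_bytes / n_rows

print(f"{avg_record_bytes:.1f} B per record")
```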

Variable types

Numeric: 9
DateTime: 1
Categorical: 9
Unsupported: 1

Alerts

High correlation: day_of_week is highly overall correlated with open
High correlation: sales is highly overall correlated with customers and 1 other field
High correlation: customers is highly overall correlated with sales
High correlation: promo2_since_year is highly overall correlated with promo2
High correlation: open is highly overall correlated with day_of_week and 1 other field
High correlation: store_type is highly overall correlated with assortment
High correlation: assortment is highly overall correlated with store_type
High correlation: promo2 is highly overall correlated with promo2_since_year
Imbalance: state_holiday is highly imbalanced (88.2%)
Unsupported: promo_interval is an unsupported type; check if it needs cleaning or further analysis
Zeros: sales has 172871 (17.0%) zeros
Zeros: customers has 172869 (17.0%) zeros
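
The imbalance and zeros figures in the alerts can be reproduced from counts listed elsewhere in this report. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. That definition is an assumption rather than something the report states, but it does yield 88.2% for state_holiday. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for one dominant class.

    Assumed definition; needs at least two classes (k >= 2).
    """
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

# state_holiday value counts for the values 0, a, b, c
state_holiday_counts = [986159, 20260, 6690, 4100]
score = imbalance_score(state_holiday_counts)

# Share of observations with zero sales
zero_share = 172871 / 1017209

print(f"imbalance = {score:.1%}, zero sales = {zero_share:.1%}")
```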

Reproduction

Analysis started: 2023-09-22 18:41:09.556427
Analysis finished: 2023-09-22 18:42:08.863303
Duration: 59.31 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json
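
The duration is simply the difference between the two timestamps. A small sketch using only the standard library, with the timestamp format string as the one assumption:

```python
from datetime import datetime

# Assumed timestamp layout, matching the two values printed above.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-09-22 18:41:09.556427", fmt)
finished = datetime.strptime("2023-09-22 18:42:08.863303", fmt)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```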

Variables

store
Real number (ℝ)

Distinct: 1115
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 558.42973
Minimum: 1
Maximum: 1115
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 56
Q1: 280
Median: 558
Q3: 838
95-th percentile: 1060
Maximum: 1115
Range: 1114
Interquartile range (IQR): 558

Descriptive statistics

Standard deviation: 321.90865
Coefficient of variation (CV): 0.57645329
Kurtosis: -1.2005237
Mean: 558.42973
Median Absolute Deviation (MAD): 279
Skewness: -0.00095487998
Sum: 5.6803974 × 10^8
Variance: 103625.18
Monotonicity: Not monotonic
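
Several of the derived statistics for store follow directly from the others. The sketch below recomputes them from the rounded values printed above, so agreement is only expected to the displayed precision.

```python
# Values copied from the store statistics above (already rounded by the report).
mean, std = 558.42973, 321.90865
q1, q3 = 280, 838
minimum, maximum = 1, 1115

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr} range={value_range}")
```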
Histogram with fixed size bins (bins=50)

Value                 Count    Frequency (%)
1                     942      0.1%
726                   942      0.1%
708                   942      0.1%
709                   942      0.1%
713                   942      0.1%
714                   942      0.1%
715                   942      0.1%
717                   942      0.1%
718                   942      0.1%
720                   942      0.1%
Other values (1105)   1007789  99.1%

Value  Count  Frequency (%)
1      942    0.1%
2      942    0.1%
3      942    0.1%
4      942    0.1%
5      942    0.1%
6      942    0.1%
7      942    0.1%
8      942    0.1%
9      942    0.1%
10     942    0.1%

Value  Count  Frequency (%)
1115   942    0.1%
1114   942    0.1%
1113   942    0.1%
1112   942    0.1%
1111   942    0.1%
1110   942    0.1%
1109   758    0.1%
1108   942    0.1%
1107   758    0.1%
1106   942    0.1%

day_of_week
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9983406
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.997391
Coefficient of variation (CV): 0.49955499
Kurtosis: -1.2468733
Mean: 3.9983406
Median Absolute Deviation (MAD): 2
Skewness: 0.0015928228
Sum: 4067148
Variance: 3.9895707
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value  Count   Frequency (%)
5      145845  14.3%
4      145845  14.3%
3      145665  14.3%
2      145664  14.3%
1      144730  14.2%
7      144730  14.2%
6      144730  14.2%

Value  Count   Frequency (%)
1      144730  14.2%
2      145664  14.3%
3      145665  14.3%
4      145845  14.3%
5      145845  14.3%
6      144730  14.2%
7      144730  14.2%

Value  Count   Frequency (%)
7      144730  14.2%
6      144730  14.2%
5      145845  14.3%
4      145845  14.3%
3      145665  14.3%
2      145664  14.3%
1      144730  14.2%
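
Because day_of_week has only seven distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below derives the sum, mean, median and median absolute deviation from the counts. The helper `weighted_median` is a hypothetical name, and for an even total it would return the upper median rather than an average.

```python
# day_of_week frequency table: value -> count
counts = {1: 144730, 2: 145664, 3: 145665, 4: 145845,
          5: 145845, 6: 144730, 7: 144730}

def weighted_median(pairs):
    """Median of data given as (value, count) pairs (upper median if the total is even)."""
    pairs = sorted(pairs)
    size = sum(c for _, c in pairs)
    running = 0
    for value, c in pairs:
        running += c
        if 2 * running > size:
            return value
    raise ValueError("no observations")

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

median = weighted_median(counts.items())
# MAD: median of the absolute deviations from the median
mad = weighted_median((abs(v - median), c) for v, c in counts.items())

print(n, total, round(mean, 7), median, mad)
```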

date
Date

Distinct: 942
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB
Minimum: 2013-01-01 00:00:00
Maximum: 2015-07-31 00:00:00
Histogram with fixed size bins (bins=50)
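
The distinct count can be compared with the number of calendar days between the minimum and maximum dates. In this sketch the two agree, which suggests every day in the range appears at least once:

```python
from datetime import date

first, last = date(2013, 1, 1), date(2015, 7, 31)

# Inclusive number of calendar days covered by the date column
days_in_range = (last - first).days + 1
distinct_dates = 942  # "Distinct" value reported for date

print(days_in_range, days_in_range == distinct_dates)
```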

sales
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 21734
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5773.819
Minimum: 0
Maximum: 41551
Zeros: 172871
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3727
Median: 5744
Q3: 7856
95-th percentile: 12137
Maximum: 41551
Range: 41551
Interquartile range (IQR): 4129

Descriptive statistics

Standard deviation: 3849.9262
Coefficient of variation (CV): 0.66679025
Kurtosis: 1.7783747
Mean: 5773.819
Median Absolute Deviation (MAD): 2067
Skewness: 0.64145962
Sum: 5.8731806 × 10^9
Variance: 14821932
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count   Frequency (%)
0                      172871  17.0%
5674                   215     < 0.1%
5558                   197     < 0.1%
5483                   196     < 0.1%
6214                   195     < 0.1%
6049                   195     < 0.1%
5723                   194     < 0.1%
5449                   192     < 0.1%
5140                   191     < 0.1%
5489                   191     < 0.1%
Other values (21724)   842572  82.8%

Value  Count   Frequency (%)
0      172871  17.0%
46     1       < 0.1%
124    1       < 0.1%
133    1       < 0.1%
286    1       < 0.1%
297    1       < 0.1%
316    1       < 0.1%
416    1       < 0.1%
506    1       < 0.1%
520    1       < 0.1%

Value  Count  Frequency (%)
41551  1      < 0.1%
38722  1      < 0.1%
38484  1      < 0.1%
38367  1      < 0.1%
38037  1      < 0.1%
38025  1      < 0.1%
37646  1      < 0.1%
37403  1      < 0.1%
37376  1      < 0.1%
37122  1      < 0.1%

customers
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 4086
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 633.14595
Minimum: 0
Maximum: 7388
Zeros: 172869
Zeros (%): 17.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 405
Median: 609
Q3: 837
95-th percentile: 1362
Maximum: 7388
Range: 7388
Interquartile range (IQR): 432

Descriptive statistics

Standard deviation: 464.41173
Coefficient of variation (CV): 0.73349871
Kurtosis: 7.0917727
Mean: 633.14595
Median Absolute Deviation (MAD): 216
Skewness: 1.5986503
Sum: 6.4404176 × 10^8
Variance: 215678.26
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                 Count   Frequency (%)
0                     172869  17.0%
560                   2414    0.2%
576                   2363    0.2%
603                   2337    0.2%
571                   2330    0.2%
555                   2328    0.2%
566                   2327    0.2%
517                   2326    0.2%
539                   2309    0.2%
651                   2299    0.2%
Other values (4076)   823307  80.9%

Value  Count   Frequency (%)
0      172869  17.0%
3      1       < 0.1%
5      1       < 0.1%
8      1       < 0.1%
13     1       < 0.1%
18     1       < 0.1%
36     1       < 0.1%
40     1       < 0.1%
44     1       < 0.1%
50     1       < 0.1%

Value  Count  Frequency (%)
7388   1      < 0.1%
5494   1      < 0.1%
5458   1      < 0.1%
5387   1      < 0.1%
5297   1      < 0.1%
5192   1      < 0.1%
5152   1      < 0.1%
5145   1      < 0.1%
5132   1      < 0.1%
5112   1      < 0.1%

open
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
1      844392
0      172817

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

Most occurring characters

Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1017209  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1017209  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      844392  83.0%
0      172817  17.0%

promo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
0      629129
1      388080

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

Most occurring characters

Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1017209  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

Most occurring scripts

Value   Count    Frequency (%)
Common  1017209  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      629129  61.8%
1      388080  38.2%

state_holiday
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
0      986159
a      20260
b      6690
c      4100

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
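
The two distinct categories reported for state_holiday can be reproduced with Python's `unicodedata` module, which exposes the general category of each code point. This is only a sketch of the idea, not necessarily how the profiling tool computes it. The standard library does not expose scripts or blocks, so only categories are shown.

```python
import unicodedata
from collections import Counter

# state_holiday value counts as listed above
value_counts = {"0": 986159, "a": 20260, "b": 6690, "c": 4100}

# Tally characters by Unicode general category, weighted by value frequency.
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += count

# "Nd" is Decimal Number, "Ll" is Lowercase Letter
print(dict(category_counts))
```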

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      986159  96.9%
a      20260   2.0%
b      6690    0.7%
c      4100    0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      986159  96.9%
a      20260   2.0%
b      6690    0.7%
c      4100    0.4%

Most occurring characters

Value  Count   Frequency (%)
0      986159  96.9%
a      20260   2.0%
b      6690    0.7%
c      4100    0.4%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    986159  96.9%
Lowercase Letter  31050   3.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
a      20260  65.2%
b      6690   21.5%
c      4100   13.2%

Decimal Number
Value  Count   Frequency (%)
0      986159  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  986159  96.9%
Latin   31050   3.1%

Most frequent character per script

Latin
Value  Count  Frequency (%)
a      20260  65.2%
b      6690   21.5%
c      4100   13.2%

Common
Value  Count   Frequency (%)
0      986159  100.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      986159  96.9%
a      20260   2.0%
b      6690    0.7%
c      4100    0.4%

school_holiday
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
0      835488
1      181721

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

Most occurring characters

Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1017209  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

Most occurring scripts

Value   Count    Frequency (%)
Common  1017209  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      835488  82.1%
1      181721  17.9%

store_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
a      551627
d      312912
c      136840
b      15830

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: c
2nd row: a
3rd row: a
4th row: c
5th row: a

Common Values

Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

Most occurring characters

Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  1017209  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

Most occurring scripts

Value  Count    Frequency (%)
Latin  1017209  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      551627  54.2%
d      312912  30.8%
c      136840  13.5%
b      15830   1.6%

assortment
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
a      537445
c      471470
b      8294

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a
2nd row: a
3rd row: a
4th row: c
5th row: a

Common Values

Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

Most occurring characters

Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  1017209  100.0%

Most frequent character per category

Lowercase Letter
Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

Most occurring scripts

Value  Count    Frequency (%)
Latin  1017209  100.0%

Most frequent character per script

Latin
Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
a      537445  52.8%
c      471470  46.3%
b      8294    0.8%

competition_distance
Real number (ℝ)

Distinct: 655
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5935.4427
Minimum: 20
Maximum: 200000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.5 MiB

Quantile statistics

Minimum: 20
5-th percentile: 130
Q1: 710
Median: 2330
Q3: 6910
95-th percentile: 20620
Maximum: 200000
Range: 199980
Interquartile range (IQR): 6200

Descriptive statistics

Standard deviation: 12547.653
Coefficient of variation (CV): 2.1140214
Kurtosis: 147.78971
Mean: 5935.4427
Median Absolute Deviation (MAD): 1980
Skewness: 10.242344
Sum: 6.0375857 × 10^9
Variance: 1.574436 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                Count   Frequency (%)
250                  11120   1.1%
50                   7536    0.7%
350                  7536    0.7%
1200                 7374    0.7%
190                  7352    0.7%
180                  6594    0.6%
90                   6594    0.6%
330                  6410    0.6%
150                  6226    0.6%
2640                 5652    0.6%
Other values (645)   944815  92.9%

Value  Count  Frequency (%)
20     942    0.1%
30     3767   0.4%
40     4710   0.5%
50     7536   0.7%
60     2826   0.3%
70     4526   0.4%
80     2826   0.3%
90     6594   0.6%
100    4710   0.5%
110    5468   0.5%

Value   Count  Frequency (%)
200000  2642   0.3%
75860   942    0.1%
58260   942    0.1%
48330   942    0.1%
46590   942    0.1%
45740   942    0.1%
44320   942    0.1%
40860   942    0.1%
40540   942    0.1%
38710   942    0.1%

competition_open_since_month
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7868491
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.3110867
Coefficient of variation (CV): 0.48786804
Kurtosis: -1.2326075
Mean: 6.7868491
Median Absolute Deviation (MAD): 3
Skewness: -0.04207563
Sum: 6903644
Variance: 10.963295
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Value              Count   Frequency (%)
9                  133844  13.2%
4                  118936  11.7%
11                 104045  10.2%
3                  96470   9.5%
7                  90651   8.9%
12                 78139   7.7%
6                  77304   7.6%
10                 75865   7.5%
5                  72530   7.1%
2                  67622   6.6%
Other values (2)   101803  10.0%

Value  Count   Frequency (%)
1      45374   4.5%
2      67622   6.6%
3      96470   9.5%
4      118936  11.7%
5      72530   7.1%
6      77304   7.6%
7      90651   8.9%
8      56429   5.5%
9      133844  13.2%
10     75865   7.5%

Value  Count   Frequency (%)
12     78139   7.7%
11     104045  10.2%
10     75865   7.5%
9      133844  13.2%
8      56429   5.5%
7      90651   8.9%
6      77304   7.6%
5      72530   7.1%
4      118936  11.7%
3      96470   9.5%

competition_open_since_year
Real number (ℝ)

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.3248
Minimum: 1900
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 1900
5-th percentile: 2002
Q1: 2008
Median: 2012
Q3: 2014
95-th percentile: 2015
Maximum: 2015
Range: 115
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.5155933
Coefficient of variation (CV): 0.0027436329
Kurtosis: 124.0713
Mean: 2010.3248
Median Absolute Deviation (MAD): 2
Skewness: -7.2356575
Sum: 2.0449205 × 10^9
Variance: 30.421769
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=23)

Value               Count   Frequency (%)
2013                204636  20.1%
2014                182822  18.0%
2015                110108  10.8%
2012                74299   7.3%
2005                56564   5.6%
2010                51258   5.0%
2009                49396   4.9%
2011                49396   4.9%
2008                48476   4.8%
2007                43744   4.3%
Other values (13)   146510  14.4%

Value  Count  Frequency (%)
1900   758    0.1%
1961   942    0.1%
1990   4710   0.5%
1994   1884   0.2%
1995   1700   0.2%
1998   942    0.1%
1999   7352   0.7%
2000   9236   0.9%
2001   14704  1.4%
2002   24882  2.4%

Value  Count   Frequency (%)
2015   110108  10.8%
2014   182822  18.0%
2013   204636  20.1%
2012   74299   7.3%
2011   49396   4.9%
2010   51258   5.0%
2009   49396   4.9%
2008   48476   4.8%
2007   43744   4.3%
2006   42802   4.2%

promo2
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
1      509178
0      508031

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

Most occurring characters

Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1017209  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

Most occurring scripts

Value   Count    Frequency (%)
Common  1017209  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
1      509178  50.1%
0      508031  49.9%

promo2_since_week
Real number (ℝ)

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.619033
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 12
Median: 22
Q3: 37
95-th percentile: 47
Maximum: 52
Range: 51
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.310064
Coefficient of variation (CV): 0.60587003
Kurtosis: -1.1840463
Mean: 23.619033
Median Absolute Deviation (MAD): 12
Skewness: 0.17872254
Sum: 24025493
Variance: 204.77794
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count   Frequency (%)
14                  84414   8.3%
40                  70046   6.9%
10                  50252   4.9%
31                  50144   4.9%
5                   47242   4.6%
1                   43225   4.2%
13                  41244   4.1%
37                  40234   4.0%
22                  40118   3.9%
18                  38742   3.8%
Other values (42)   511548  50.3%

Value  Count  Frequency (%)
1      43225  4.2%
2      11424  1.1%
3      11424  1.1%
4      11424  1.1%
5      47242  4.6%
6      12366  1.2%
7      11424  1.1%
8      11424  1.1%
9      23876  2.3%
10     50252  4.9%

Value  Count  Frequency (%)
52     7448   0.7%
51     7448   0.7%
50     8390   0.8%
49     8206   0.8%
48     15742  1.5%
47     7448   0.7%
46     7448   0.7%
45     36716  3.6%
44     10090  1.0%
43     7448   0.7%

promo2_since_year
Real number (ℝ)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.7933
Minimum: 2009
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.6 MiB

Quantile statistics

Minimum: 2009
5-th percentile: 2009
Q1: 2012
Median: 2013
Q3: 2014
95-th percentile: 2015
Maximum: 2015
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.6626582
Coefficient of variation (CV): 0.00082604518
Kurtosis: -0.21007513
Mean: 2012.7933
Median Absolute Deviation (MAD): 1
Skewness: -0.78443626
Sum: 2.0474315 × 10^9
Variance: 2.7644323
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value  Count   Frequency (%)
2013   309023  30.4%
2014   274066  26.9%
2015   124380  12.2%
2011   115056  11.3%
2012   73174   7.2%
2009   65270   6.4%
2010   56240   5.5%

Value  Count   Frequency (%)
2009   65270   6.4%
2010   56240   5.5%
2011   115056  11.3%
2012   73174   7.2%
2013   309023  30.4%
2014   274066  26.9%
2015   124380  12.2%

Value  Count   Frequency (%)
2015   124380  12.2%
2014   274066  26.9%
2013   309023  30.4%
2012   73174   7.2%
2011   115056  11.3%
2010   56240   5.5%
2009   65270   6.4%

promo_interval
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

month_map
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value              Count
May                103695
Mar                103695
Jan                103694
Jun                100350
Apr                100350
Other values (7)   505425

Length

Max length: 4
Median length: 3
Mean length: 3.0604596
Min length: 3

Characters and Unicode

Total characters: 3113127
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
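
The length statistics for month_map are tied together by simple arithmetic. Since the minimum length is 3 and the maximum is 4, every value has either three or four characters, and the surplus of total characters over three per row gives the number of four-character values. A short sketch using only the figures above:

```python
n_rows = 1_017_209
total_chars = 3_113_127
min_len, max_len = 3, 4

# Lengths can only be 3 or 4, so each 4-character row adds exactly one extra character.
four_char_rows = total_chars - min_len * n_rows
mean_len = total_chars / n_rows

print(four_char_rows, round(mean_len, 7))
```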

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jul
2nd row: Jul
3rd row: Jul
4th row: Jul
5th row: Jul

Common Values

Value              Count   Frequency (%)
May                103695  10.2%
Mar                103695  10.2%
Jan                103694  10.2%
Jun                100350  9.9%
Apr                100350  9.9%
Jul                98115   9.6%
Fev                93660   9.2%
Dec                63550   6.2%
Oct                63550   6.2%
Aug                63550   6.2%
Other values (2)   123000  12.1%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
may                103695  10.2%
mar                103695  10.2%
jan                103694  10.2%
jun                100350  9.9%
apr                100350  9.9%
jul                98115   9.6%
fev                93660   9.2%
dec                63550   6.2%
oct                63550   6.2%
aug                63550   6.2%
Other values (2)   123000  12.1%

Most occurring characters

Value               Count   Frequency (%)
a                   311084  10.0%
J                   302159  9.7%
u                   262015  8.4%
e                   218710  7.0%
M                   207390  6.7%
r                   204045  6.6%
n                   204044  6.6%
A                   163900  5.3%
p                   161850  5.2%
v                   155160  5.0%
Other values (11)   922770  29.6%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  2095918  67.3%
Uppercase Letter  1017209  32.7%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
a                  311084  14.8%
u                  262015  12.5%
e                  218710  10.4%
r                  204045  9.7%
n                  204044  9.7%
p                  161850  7.7%
v                  155160  7.4%
c                  127100  6.1%
t                  125050  6.0%
y                  103695  4.9%
Other values (3)   223165  10.6%

Uppercase Letter
Value  Count   Frequency (%)
J      302159  29.7%
M      207390  20.4%
A      163900  16.1%
F      93660   9.2%
O      63550   6.2%
D      63550   6.2%
N      61500   6.0%
S      61500   6.0%

Most occurring scripts

Value  Count    Frequency (%)
Latin  3113127  100.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
a                   311084  10.0%
J                   302159  9.7%
u                   262015  8.4%
e                   218710  7.0%
M                   207390  6.7%
r                   204045  6.6%
n                   204044  6.6%
A                   163900  5.3%
p                   161850  5.2%
v                   155160  5.0%
Other values (11)   922770  29.6%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3113127  100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
a                   311084  10.0%
J                   302159  9.7%
u                   262015  8.4%
e                   218710  7.0%
M                   207390  6.7%
r                   204045  6.6%
n                   204044  6.6%
A                   163900  5.3%
p                   161850  5.2%
v                   155160  5.0%
Other values (11)   922770  29.6%

is_promo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.5 MiB

Value  Count
0      853337
1      163872

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1017209
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Most occurring characters

Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Most occurring categories

Value           Count    Frequency (%)
Decimal Number  1017209  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Most occurring scripts

Value   Count    Frequency (%)
Common  1017209  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1017209  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      853337  83.9%
1      163872  16.1%

Interactions

2023-09-22T19:41:47.321440image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:49.427610image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:51.526735image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:53.628238image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:55.867216image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:57.995799image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:42:00.150514image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:42:02.419042image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:42:04.784301image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:47.534744image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:49.639130image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:51.734715image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:53.889322image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:56.091166image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:41:58.213468image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:42:00.371246image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/
2023-09-22T19:42:02.628456image/svg+xmlMatplotlib v3.6.2, https://matplotlib.org/

Correlations

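| | store | day_of_week | sales | customers | competition_distance | competition_open_since_month | competition_open_since_year | promo2_since_week | promo2_since_year | open | promo | state_holiday | school_holiday | store_type | assortment | promo2 | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| store | 1.000 | -0.000 | 0.001 | 0.022 | -0.045 | -0.033 | 0.002 | 0.004 | 0.008 | 0.006 | 0.000 | 0.000 | 0.000 | 0.095 | 0.111 | 0.072 | 0.000 | 0.036 |
| day_of_week | -0.000 | 1.000 | -0.451 | -0.431 | -0.000 | -0.002 | 0.000 | -0.002 | 0.001 | 0.876 | 0.495 | 0.120 | 0.264 | 0.000 | 0.000 | 0.000 | 0.031 | 0.011 |
| sales | 0.001 | -0.451 | 1.000 | 0.903 | -0.026 | -0.002 | 0.029 | 0.056 | 0.063 | 0.705 | 0.453 | 0.155 | 0.083 | 0.108 | 0.081 | 0.104 | 0.044 | 0.051 |
| customers | 0.022 | -0.431 | 0.903 | 1.000 | -0.178 | 0.002 | 0.015 | 0.032 | 0.122 | 0.327 | 0.279 | 0.071 | 0.048 | 0.318 | 0.271 | 0.182 | 0.024 | 0.082 |
| competition_distance | -0.045 | -0.000 | -0.026 | -0.178 | 1.000 | -0.024 | 0.007 | -0.014 | 0.029 | 0.013 | 0.000 | 0.000 | 0.002 | 0.045 | 0.063 | 0.159 | 0.000 | 0.070 |
| competition_open_since_month | -0.033 | -0.002 | -0.002 | 0.002 | -0.024 | 1.000 | -0.234 | 0.108 | 0.017 | 0.025 | 0.013 | 0.061 | 0.132 | 0.068 | 0.059 | 0.130 | 0.326 | 0.085 |
| competition_open_since_year | 0.002 | 0.000 | 0.029 | 0.015 | 0.007 | -0.234 | 1.000 | 0.017 | 0.031 | 0.000 | 0.000 | 0.001 | 0.002 | 0.054 | 0.078 | 0.054 | 0.002 | 0.029 |
| promo2_since_week | 0.004 | -0.002 | 0.056 | 0.032 | -0.014 | 0.108 | 0.017 | 1.000 | -0.121 | 0.033 | 0.052 | 0.101 | 0.182 | 0.073 | 0.091 | 0.283 | 0.432 | 0.161 |
| promo2_since_year | 0.008 | 0.001 | 0.063 | 0.122 | 0.029 | 0.017 | 0.031 | -0.121 | 1.000 | 0.007 | 0.015 | 0.016 | 0.027 | 0.085 | 0.110 | 0.682 | 0.099 | 0.306 |
| open | 0.006 | 0.876 | 0.705 | 0.327 | 0.013 | 0.025 | 0.000 | 0.033 | 0.007 | 1.000 | 0.295 | 0.378 | 0.086 | 0.051 | 0.039 | 0.008 | 0.074 | 0.000 |
| promo | 0.000 | 0.495 | 0.453 | 0.279 | 0.000 | 0.013 | 0.000 | 0.052 | 0.015 | 0.295 | 1.000 | 0.054 | 0.067 | 0.000 | 0.000 | 0.000 | 0.052 | 0.005 |
| state_holiday | 0.000 | 0.120 | 0.155 | 0.071 | 0.000 | 0.061 | 0.001 | 0.101 | 0.016 | 0.378 | 0.054 | 1.000 | 0.213 | 0.002 | 0.002 | 0.011 | 0.220 | 0.027 |
| school_holiday | 0.000 | 0.264 | 0.083 | 0.048 | 0.002 | 0.132 | 0.002 | 0.182 | 0.027 | 0.086 | 0.067 | 0.213 | 1.000 | 0.002 | 0.002 | 0.007 | 0.385 | 0.027 |
| store_type | 0.095 | 0.000 | 0.108 | 0.318 | 0.045 | 0.068 | 0.054 | 0.073 | 0.085 | 0.051 | 0.000 | 0.002 | 0.002 | 1.000 | 0.537 | 0.106 | 0.007 | 0.045 |
| assortment | 0.111 | 0.000 | 0.081 | 0.271 | 0.063 | 0.059 | 0.078 | 0.091 | 0.110 | 0.039 | 0.000 | 0.002 | 0.002 | 0.537 | 1.000 | 0.015 | 0.005 | 0.008 |
| promo2 | 0.072 | 0.000 | 0.104 | 0.182 | 0.159 | 0.130 | 0.054 | 0.283 | 0.682 | 0.008 | 0.000 | 0.011 | 0.007 | 0.106 | 0.015 | 1.000 | 0.028 | 0.438 |
| month_map | 0.000 | 0.031 | 0.044 | 0.024 | 0.000 | 0.326 | 0.002 | 0.432 | 0.099 | 0.074 | 0.052 | 0.220 | 0.385 | 0.007 | 0.005 | 0.028 | 1.000 | 0.274 |
| is_promo | 0.036 | 0.011 | 0.051 | 0.082 | 0.070 | 0.085 | 0.029 | 0.161 | 0.306 | 0.000 | 0.005 | 0.027 | 0.027 | 0.045 | 0.008 | 0.438 | 0.274 | 1.000 |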
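
To make a numeric entry such as the sales/customers coefficient more concrete, here is a minimal sketch of a Spearman rank correlation in plain Python. It is an illustration only: the report does not state which coefficient each cell uses, so treating Spearman as the measure is an assumption, and the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical. The input is the `sales` and `customers` values from the ten first rows listed in the Sample section, so the result will not match the full-dataset figure in the matrix.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# sales and customers from the ten first rows of the sample (not the full dataset).
sales = [5263, 6064, 8314, 13995, 4822, 5651, 15344, 8492, 8565, 7185]
customers = [555, 625, 821, 1498, 559, 589, 1414, 833, 687, 681]

rho = spearman(sales, customers)
print(round(rho, 3))
```

With pandas available, `DataFrame.corr(method="spearman")` computes the same quantity for every pair of numeric columns.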

Missing values

Figure: a simple visualization of nullity by column.
Figure: the nullity matrix, a data-dense display that lets you quickly spot patterns in data completion.
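
The two displays described above can be sketched without any plotting library. The snippet below is a rough, dependency-free illustration, not the profiler's own implementation: `nullity_by_column` and `nullity_matrix` are hypothetical helper names, and the toy records are invented because the profiled dataset has no missing cells.

```python
def nullity_by_column(rows, columns):
    """Count the non-missing values in each column; None marks a missing cell."""
    return {col: sum(row.get(col) is not None for row in rows) for col in columns}


def nullity_matrix(rows, columns):
    """One string per row: '#' where a value is present, '.' where it is missing."""
    return [
        "".join("#" if row.get(col) is not None else "." for col in columns)
        for row in rows
    ]


# Hypothetical toy records, not taken from the report (which has 0 missing cells).
columns = ["store", "sales", "promo_interval"]
rows = [
    {"store": 1, "sales": 5263, "promo_interval": None},
    {"store": 2, "sales": 6064, "promo_interval": "Jan,Apr,Jul,Oct"},
    {"store": 3, "sales": None, "promo_interval": "Jan,Apr,Jul,Oct"},
]

counts = nullity_by_column(rows, columns)
matrix = nullity_matrix(rows, columns)
print(counts)
for line in matrix:
    print(line)
```

On a pandas DataFrame, `df.notna().sum()` gives the per-column counts and `df.notna()` gives the boolean matrix directly.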

Sample

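| | store | day_of_week | date | sales | customers | open | promo | state_holiday | school_holiday | store_type | assortment | competition_distance | competition_open_since_month | competition_open_since_year | promo2 | promo2_since_week | promo2_since_year | promo_interval | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 5 | 2015-07-31 | 5263 | 555 | 1 | 1 | 0 | 1 | c | a | 1270.000 | 9 | 2008 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 1 | 2 | 5 | 2015-07-31 | 6064 | 625 | 1 | 1 | 0 | 1 | a | a | 570.000 | 11 | 2007 | 1 | 13 | 2010 | Jan,Apr,Jul,Oct | Jul | 1 |
| 2 | 3 | 5 | 2015-07-31 | 8314 | 821 | 1 | 1 | 0 | 1 | a | a | 14130.000 | 12 | 2006 | 1 | 14 | 2011 | Jan,Apr,Jul,Oct | Jul | 1 |
| 3 | 4 | 5 | 2015-07-31 | 13995 | 1498 | 1 | 1 | 0 | 1 | c | c | 620.000 | 9 | 2009 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 4 | 5 | 5 | 2015-07-31 | 4822 | 559 | 1 | 1 | 0 | 1 | a | a | 29910.000 | 4 | 2015 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 5 | 6 | 5 | 2015-07-31 | 5651 | 589 | 1 | 1 | 0 | 1 | a | a | 310.000 | 12 | 2013 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 6 | 7 | 5 | 2015-07-31 | 15344 | 1414 | 1 | 1 | 0 | 1 | a | c | 24000.000 | 4 | 2013 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 7 | 8 | 5 | 2015-07-31 | 8492 | 833 | 1 | 1 | 0 | 1 | a | a | 7520.000 | 10 | 2014 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 8 | 9 | 5 | 2015-07-31 | 8565 | 687 | 1 | 1 | 0 | 1 | a | c | 2030.000 | 8 | 2000 | 0 | 31 | 2015 | 0 | Jul | 0 |
| 9 | 10 | 5 | 2015-07-31 | 7185 | 681 | 1 | 1 | 0 | 1 | a | a | 3160.000 | 9 | 2009 | 0 | 31 | 2015 | 0 | Jul | 0 |

| | store | day_of_week | date | sales | customers | open | promo | state_holiday | school_holiday | store_type | assortment | competition_distance | competition_open_since_month | competition_open_since_year | promo2 | promo2_since_week | promo2_since_year | promo_interval | month_map | is_promo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1017199 | 1106 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 5330.000 | 9 | 2011 | 1 | 31 | 2013 | Jan,Apr,Jul,Oct | Jan | 1 |
| 1017200 | 1107 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 1400.000 | 6 | 2012 | 1 | 13 | 2010 | Jan,Apr,Jul,Oct | Jan | 1 |
| 1017201 | 1108 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 540.000 | 4 | 2004 | 0 | 1 | 2013 | 0 | Jan | 0 |
| 1017202 | 1109 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | a | 3490.000 | 4 | 2011 | 1 | 22 | 2012 | Jan,Apr,Jul,Oct | Jan | 1 |
| 1017203 | 1110 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | c | 900.000 | 9 | 2010 | 0 | 1 | 2013 | 0 | Jan | 0 |
| 1017204 | 1111 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | a | 1900.000 | 6 | 2014 | 1 | 31 | 2013 | Jan,Apr,Jul,Oct | Jan | 1 |
| 1017205 | 1112 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | c | c | 1880.000 | 4 | 2006 | 0 | 1 | 2013 | 0 | Jan | 0 |
| 1017206 | 1113 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 9260.000 | 1 | 2013 | 0 | 1 | 2013 | 0 | Jan | 0 |
| 1017207 | 1114 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | a | c | 870.000 | 1 | 2013 | 0 | 1 | 2013 | 0 | Jan | 0 |
| 1017208 | 1115 | 2 | 2013-01-01 | 0 | 0 | 0 | 0 | a | 1 | d | c | 5350.000 | 1 | 2013 | 1 | 22 | 2012 | Mar,Jun,Sept,Dec | Jan | 0 |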
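
In the sample rows, `is_promo` appears to be 1 exactly when `month_map` is one of the months listed in `promo_interval`. The report does not document how the column was built, so the rule below is an assumption inferred from the rows shown, and the function name `is_promo` is only a hypothetical stand-in for whatever produced the column.

```python
def is_promo(promo_interval, month_map):
    """Return 1 if month_map is one of the comma-separated months in promo_interval.

    Assumption: a promo_interval of "0" means the store has no recurring promotion.
    """
    if promo_interval == "0":
        return 0
    return int(month_map in promo_interval.split(","))


# (promo_interval, month_map, is_promo) triples copied from the sample rows.
sample = [
    ("0", "Jul", 0),
    ("Jan,Apr,Jul,Oct", "Jul", 1),
    ("Jan,Apr,Jul,Oct", "Jan", 1),
    ("Mar,Jun,Sept,Dec", "Jan", 0),
]

results = [is_promo(interval, month) for interval, month, _ in sample]
print(results)
```

Splitting on commas rather than testing for a substring avoids accidental matches if a month label ever appears inside a longer token.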